|
1. Albiero A, Vitulo N, Forcato C, Campagna D, Caniato E, Bilardi A, Schiavon R, D'Angelo M, Zimbello R, Valle G Bioinformatic analysis of SOLiD transcriptome data Meeting: BITS 2009 - Year: 2009 Topic: Novel methods and algorithms Abstract: Missing |
2. Bertocco E, Cannata N, Toppo S, Fontana P, Scannapieco P, Valle G From sequence to function using links to ortholog genes Meeting: BIOCOMP 2001 - Year: 2001 Topic: Abstract: Missing |
3. Bilardi A, Campagna D, Campanaro S, Cestaro A, Levorin F, Vitulo N, Vezzi A, Valle G, Cannata N Quest for rho dependent terminators in prokaryotic genomes Meeting: BITS 2005 - Year: 2005 Topic: Unspecified Abstract: Two kinds of transcription terminators are known in prokaryotes, distinguished by their mechanisms and DNA sequences. When RNA polymerase encounters an intrinsic terminator (RIT), it can release the nascent RNA spontaneously, but when it encounters a Rho-dependent terminator (RDT), release of the RNA depends on the action of a protein factor called Rho. RDTs are involved in gene expression as attenuators in the leader or within operons, and as terminators at the end of operons. An RDT consists of three distinct parts, which together extend over 150-200 bp of DNA (figure 1). The upstream part, called the Rho utilization (rut) site, encodes a segment of the nascent transcript to which Rho can bind and is essential for starting termination. The central part, which we call the Rho activity (rac) sequence, is the second mRNA binding site to which Rho can bind and is essential for helicase/translocation activity. The downstream part, called the transcription stop point (tsp) region, is where RNA polymerase pauses during elongation in the absence of Rho. The literature contains only a small number of studies of single RDTs; very little is known about their structure and sequence, and no in silico predictive method exists. |
4. Campagna D, Romualdi C, Vitulo N, Favero M, Lexa M, Cannata N, Valle G RapH and RapD: two indexes designed for de novo identification of repeats in whole Meeting: BITS 2005 - Year: 2005 Topic: Unspecified Abstract: The identification of repeats is an essential step in genome analysis and annotation, but it is not easy because repeats tend to be poorly conserved during evolution. This makes it very difficult to identify homologous sequences that have diverged significantly, both within the same genome and between genomes of different organisms. |
5. Campagna D, Valle G Development and usage of a new bioinformatic strategy for identifying repeated sequences in genomic DNA Meeting: BIOCOMP 2002 - Year: 2002 Topic: Abstract: Missing |
6. Caniato E, Vezzi A, Albiero A, Campagna D, Schiavon R, D'Angelo M, Zamperin G, Forcato C, Vitulo N, Valle G De novo assembly combining SOLiD mate-pair and 454 data Meeting: Proceedings of BITS 2010 Meeting - Year: 2010 Topic: New tools for NGS Abstract: Missing |
7. Cannata N, Dioguardi R, Fontana P, Scannapieco P, Toppo S, Lanfranchi G, Valle G An integrated knowledge-base of gene expression in human skeletal muscle Meeting: BIOCOMP 2000 - Year: 2000 Topic: Databanks Abstract: We have built a solid scaffolding that can hold and connect muscle transcript sequencing data to functional data, expression profiles, genomic sequences and genetic diseases. The starting point is the wide collection of skeletal muscle ESTs produced at CRIBI, which are automatically analysed, filtered and stored in an SQL table (HSPD-EST). A schematic view of the organization of the data is shown in the figure. ESTs are assembled into clusters (HSPD-CLUSTER table), which are very transitory entities, as they may change at every new assembly depending on the order in which the ESTs were merged or on the presence of new variant isoforms arising from alternative splicing or paralogous genes. On the other hand, many transcripts have now been well characterised and should therefore be considered stable entities. We therefore decided to implement a Transcript Integrated Table (TRAIT) of human skeletal muscle that includes some of the established information already available. As can be seen in the figure, we have also implemented a Single-Transcript Integrated Table (STRAIT), where different transcripts are stored in different records even if they come from the same gene, for instance after alternative splicing. Thus, every single transcript is recorded in STRAIT, while TRAIT is used to link together those transcripts that originate from the same gene. When a new cluster is discovered, a provisional STRAIT record is automatically created. Records become permanent after the addition of further information such as full-length sequencing, functional studies and high-density hybridisation experiments, which are currently performed in our laboratory.
All the above information is organised under an SQL database management system, in a protected intranet environment, currently including more than 4,000 STRAIT records. All the tables are periodically translated into SRS databases and are accessible on the web at http://grup.bio.unipd.it/. The full implementation of the other databases (shown in the figure in light blue) is currently under way. In particular, a series of scripts and automatic procedures has been developed to link full and partial transcripts to genomic sequences, in view of the release of the entire human genome sequence. Our scripts use programs such as Blast, GeneFinder and Sim4 to perform this analysis systematically on every transcript in our database. Identifying the genomic sequence allows a simple and exact localisation of the genes and gives an indication of the full-length sequence, introns, exons, alternative splicing and promoter region. Similar systematic procedures are also under way to link our muscle transcripts to sequences from model organisms such as yeast, C. elegans, Drosophila and mouse. |
8. Cannata N, Forcato C, Fabbro G, Pasin A, Balen J, Valle G Searching for discriminating degenerated patterns between two populations of sequences Meeting: BITS 2004 - Year: 2004 Topic: Unspecified Abstract: In this work we present the development of a bioinformatics tool for identifying sequence patterns that discriminate between two populations of sequences. Examples where it could be used are easy to find in genomics and proteomics: introns/exons in gene sequences, coding/non-coding regions in transcript sequences, and proteins that are transported to a given subcellular localization versus those that are not. Once the patterns are detected, they could be searched for in non-annotated sequences by a program developed specifically to find degenerate patterns. We expect that such a method, used jointly with other more traditional methods, could lead to better predictive power in annotation processes. |
9. Cannata N, Toppo S, Romualdi C, Valle G Simplifying amino acid alphabets by means of a branch and bound algorithm and substitution matrices Meeting: BIOCOMP 2002 - Year: 2002 Topic: Abstract: Missing |
10. Cestaro A, Tosatto SCE, Fogolari F, Toppo S, Valle G CASPITA @ CASP5 Meeting: BIOCOMP 2003 - Year: 2003 Topic: Structural genomics Abstract: Missing |
11. Fogolari F, Tosatto SCE, Cestaro A, Valle G, Molinari H Native loop conformation recognition by MM/PBSA energy calculation Meeting: BIOCOMP 2003 - Year: 2003 Topic: Structural genomics Abstract: Missing |
12. Fontana P, Segala C, Toppo S, Moser C, Grando S, Valle G, Velasco R Bioinformatics within the IASMA grape project: tools for data mining and sequences annotation Meeting: BIOCOMP 2003 - Year: 2003 Topic: Comparative genomics and molecular evolution Abstract: Missing |
13. Lamontanara A, Vitulo N, Albiero A, Forcato C, Campagna D, Dal Pero F, Cattivelli L, Bagnaresi P, Colaiacovo M, Faccioli P, Simkova H, Dolezel J, Perrotta G, Giuliano G, Valle G, Stanca M The repetitive landscape of Wheat Chromosome 5A. A preliminary study based on low-coverage NGS technologies Meeting: Proceedings of BITS 2010 Meeting - Year: 2010 Topic: Genomics Abstract: Missing |
14. Lexa M, Valle G Combining rapid word searches with segment-to-segment alignment for sensitive similarity detection, domain identification and structural modelling. Meeting: BITS 2004 - Year: 2004 Topic: Structural genomics Abstract: The most popular alignment and similarity search techniques are based on the classical Smith-Waterman scoring scheme. Conservation of a single structural or functional feature between proteins may be undetectable with such techniques, because the similarities tend to persist only in key areas consisting of residues dispersed in a non-trivial manner. We propose a novel method that finds occurrences of short similar words common to the studied sequences and handles the identified matches in a manner similar to segment-to-segment alignment [2]. Our interest in this area stems from the development of programs for fast searches with mismatches in large biological databases [1]. As shown here, these programs can support large database searches that lead to automatic domain detection and sequence annotation. The use of this technique in fold recognition and structure prediction is being studied. |
15. Lexa M, Zara I, Valle G PRIMEX 1.0 and VPCR 2.0: Processing genomic sequence data for efficient and accurate simulation of PCR reactions with genomic DNA as template Meeting: BIOCOMP 2003 - Year: 2003 Topic: Novel algorithms Abstract: Missing |
16. Manavski S, Mariano A, Valle G CUDA compatible GPU cards as efficient hardware accelerators for Smith-Waterman sequence alignment Meeting: BITS 2007 - Year: 2007 Topic: Novel methodologies, algorithms and tools Abstract: Missing |
17. Mittempergher L, Picelli S, Feltrin E, Colluto L, Nofrate V, Caldara F, Millino C, Campanaro S, Valle G Contribution to the ontology and system biology of muscle genes and application to microarray expression studies Meeting: BITS 2006 - Year: 2006 Topic: Microarray design and data analysis Abstract: Missing |
18. Picardi E, Horner DS, Chiara M, Schiavon R, Valle G, Pesole G Large scale detection and analysis of RNA editing in grape mtDNA by RNA deep-sequencing Meeting: Proceedings of BITS 2010 Meeting - Year: 2010 Topic: Transcriptomics Abstract: Missing |
19. Romualdi C, Celegato B, Campanaro S, Cannata N, Toppo S, Valle G, Lanfranchi G Management and Statistical Analysis of Microarrays Data Meeting: BIOCOMP 2002 - Year: 2002 Topic: Abstract: Missing |
20. Toppo S, Cannata N, Romualdi C, Fontana P, Laveder P, Lanfranchi G, Valle G Muscle-TRAIT: an integrated platform for storage, annotation and retrieval of data related to muscle transcripts Meeting: BIOCOMP 2002 - Year: 2002 Topic: Abstract: Missing |
21. Toppo S, Fontana P, Cannata N, Scannapieco P, Bertocco E, Valle G TRAIT: a database of transcripts expressed in human skeletal muscle Meeting: BIOCOMP 2001 - Year: 2001 Topic: Abstract: Missing |
22. Toppo S, Fontana P, Velasco R, Valle G, Tosatto SCE FOX (FOld eXtractor): A novel protein fold recognition method using iterative PSI-BLAST searches and structural alignments Meeting: BITS 2004 - Year: 2004 Topic: Unspecified Abstract: We present a novel fold recognition method based on the combination of detailed sequence searches and structural information. Presently the protocol implements two different approaches to assign the correct fold to the target protein sequence: the first uses a database secondary structure search and the second an iterative database sequence search. In the first phase, a secondary structure prediction of the target is performed, based on the ConSSPred protocol. This prediction is used to search for hits against a database of known secondary structures extracted from PDB (using DSSP). The search follows a two-step strategy: the first step is a Smith-Waterman local secondary structure similarity search with a specific substitution matrix optimized for secondary structure alignment. The second step is a global alignment based on SSEA (Secondary Structure Element Alignment), as implemented in our program MANIFOLD, which refines the score and the alignment itself in the region extracted from the first step. At the end of the first phase, a list of hits that share a similar secondary structure topology with the target sequence is extracted. The second phase is based on a modified version of a protocol for scanning the sequence database called SENSER. At the beginning of the second phase, BLASTP is used to scan the target sequence against the NR database. These initial hits are clustered to reduce sequence bias, and a seed alignment with 20 or fewer sequences is generated. This step ensures that PSI-BLAST can be jump-started with a more sensitive initial profile, increasing its sequence diversity.
PSI-BLAST is run for four iterations (e-value inclusion threshold 10e-3) on the NR60 database of known sequences. NR60 is produced by applying the CD-HIT algorithm to cluster the NR database at 60% sequence identity. Sequences producing NR60 hits with the query are assigned either to the significant sequence space (e-value <= 10e-3) or to the trailing end (e-value <= 10) for further use. The profile is used to search the PDBAA database of sequences with known structure. If a significant PDBAA hit (e-value <= 10) is found, the protocol proceeds to the back-validation step (see below). If no significant hit is found, or the hit does not back-validate, a new PSI-BLAST search, using the above "4+1" protocol on NR and PDBAA, is started for the highest ranking sequence (i.e. lowest e-value) in the significant sequence space. Sequences from NR60 matching the query are also assigned to either the significant sequence space or the trailing end. Significant PDBAA hits are again submitted to back-validation. If no significant PDBAA hit is recorded and the significant sequence space has been exhausted, the protocol uses the trailing end sequences as additional starting points for PSI-BLAST searches. In contrast to the previous sequences, which were assumed to be similar enough to the target to imply homology, these sequences are submitted to back-validation before proceeding to the "4+1" PSI-BLAST protocol. The back-validation step consists of using PSI-BLAST to find the target starting from a different query sequence, found as described above. Because PSI-BLAST is asymmetric, if sequence A finds sequence B, it is not always the case that B also finds A. Sequences that back-validate are more likely to be correct hits. Once a sequence from PDBAA back-validates and its secondary structure is compatible with that of the target sequence as found in the first phase, the protocol builds a target-to-template alignment and stops.
The procedure described so far serves to identify a template structure for the target sequence. In order to produce an accurate alignment, HMMER is used to build a hidden Markov model (HMM) based on the HOMSTRAD sequence alignment. The target is then aligned to the template using this HMM. Preliminary results for the method indicate a clear increase in both detection rate and alignment accuracy for distantly homologous sequences. Presently FOX has been tested on the Fischer-68 test set to compare its performance with standard PSI-BLAST searches, GenTHREADER and the original SENSER protocol. As expected, the introduction of the secondary structure prediction of the protein target and the database secondary structure searches in the first phase has increased the detection sensitivity of the method compared to profile-based searches such as PSI-BLAST and the SENSER protocol (Fig. 1). The performance is comparable to GenTHREADER, with the right template structure always found in the top 50 hits, as shown in Fig. 1. Further score optimization and development are required to fully test the entire protocol and make the program available as a web-based server from our group's web site (http://protein.cribi.unipd.it/). |
23. Vitulo N, Cestaro A, Vezzi A, Campanaro S, Simonato F, Lauro F, Malacrida G, Simionati B, Cannata N, Bartlett D, Valle G Development of tools based on UCSC and KEGG for the annotation of the Photobacterium profundum genome Meeting: BITS 2004 - Year: 2004 Topic: Unspecified Abstract: One of the critical steps in a genome sequencing project is the efficient storage and retrieval of the large amount of information produced, which represents the starting point for data analysis and interpretation. We have recently completed the genome sequence of Photobacterium profundum strain SS9, and the data have been loaded into a genome browser under the UCSC environment. The UCSC genome browser was developed at the University of California, Santa Cruz, and CRIBI hosts one of its official mirror sites at http://genome.cribi.unipd.it. The sequence and annotation information is stored in a MySQL relational database, and a web-based tool performs fast visualization and querying of the data. The records are displayed as a series of tracks aligned with the genomic sequence. The Photobacterium profundum genome browser contains the ORF predictions obtained by two different programs (Orpheus and Glimmer) and the related non-redundant ORF consensus, the ribosome, tRNA, operons, the clones spotted on the microarray chips, the differentially expressed clones derived from microarray experiments, the orthologous genes in other bacteria, the phage and a prediction of the repeated elements in the genome. |
24. Vitulo N, Cestaro A, Vezzi A, D'Angelo M, Simonato F, Malacrida G, Campanaro S, Valle G Annotation of Photobacterium profundum genome Meeting: BIOCOMP 2003 - Year: 2003 Topic: Databases: ontologies and integration Abstract: Missing |
25. Vitulo N, Vezzi A, Campanaro S, Romualdi C, Lauro F, Valle G A Global Gene Evolution Analysis on Vibrionaceae Family Using Phylogenetic Profile Meeting: BITS 2006 - Year: 2006 Topic: Genomics Abstract: Missing |
26. Vitulo N, Vezzi A, D'Angelo M, Cestaro A, Scannapieco P, Valle G Development of new bioinformatic tools for finishing and annotating bacterial genomes: application to the genomic sequence of P. profundum: a barophile/psychrophile bacterium Meeting: BIOCOMP 2002 - Year: 2002 Topic: Abstract: Missing |
27. Zara I, Schiavon R, Valle G Development of new bioinformatic tools to analyze the HLA genetic system Meeting: BIOCOMP 2003 - Year: 2003 Topic: Novel algorithms Abstract: Missing |